Prajna: Cloud Service and Interactive Big Data Analytics

Authors

  • Jin Li
  • Sanjeev Mehrotra
  • Weirong Zhu
Abstract

Apache Spark has attracted broad attention in both academia and industry. When people talk about Spark, the first thing that comes to mind is Resilient Distributed Datasets (RDDs), which let programmers perform in-memory computations on large clusters in a fault-tolerant manner. While the RDD is certainly a great contribution, an overlooked aspect of Spark is how it harnesses functional programming concepts (closures, lazy computation, and delegates) in building distributed systems. We believe that this is what makes Spark flexible enough to support different data processing scenarios under one common platform and to execute distributed operations efficiently. We also believe that Spark has just scratched the surface, and that broader use of functional computing concepts can fundamentally change how distributed systems are built. In this paper, we describe Prajna, a distributed functional programming platform. Prajna is built on top of .Net and F# and is released as open source [31] under the Apache License v2.0; all components that Prajna depends upon, including .Net and F#, have also been open sourced. Prajna not only supports (and extends) Spark-style in-memory data analytics on large clusters, but also supports the development and deployment of cloud services. Moreover, we show that Prajna can harmonize cloud services and data analytics services, and can add rich data analytics to any existing cloud service or application. Prajna supports running cloud services and interactive data analytics in both managed and unmanaged code, and supports running remote code with significant data components (e.g., a recognition model that is hundreds of megabytes in size). With little programming effort (a day's work), Prajna allows a programmer to add interactive data analytics to existing cloud services and applications with an analytical turnaround time of under a second. The analyzed data (e.g., statuses of the cloud service, users' inputs, etc.) is processed entirely in memory during the analytical cycles and is never stored to disk. In addition, data is first accumulated and aggregated locally before being sent across the network for cluster-wide aggregation. As such, through Prajna, cloud-wide telemetry data becomes available right at users' fingertips.
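
The abstract's local-then-cluster-wide aggregation can be illustrated with a minimal Python sketch. This is not Prajna's API (Prajna is built on .Net and F#). The names `make_local_aggregator`, `cluster_aggregate`, and the sample `nodes` data are hypothetical and chosen only for illustration. The sketch uses a closure to capture the grouping key and a generator so that per-node work is deferred until the result is consumed.

```python
from collections import Counter
from functools import reduce
from typing import Callable, Iterable, List


def make_local_aggregator(key_of: Callable[[dict], str]) -> Callable[[Iterable[dict]], Counter]:
    """Return a closure that counts events per key on a single node."""
    def aggregate(events: Iterable[dict]) -> Counter:
        # key_of is captured from the enclosing scope (the closure).
        return Counter(key_of(e) for e in events)
    return aggregate


def cluster_aggregate(partials: Iterable[Counter]) -> Counter:
    """Merge per-node partial counts into one cluster-wide result."""
    return reduce(lambda a, b: a + b, partials, Counter())


# Hypothetical in-memory telemetry held by three nodes (illustrative data only).
nodes: List[List[dict]] = [
    [{"status": "ok"}, {"status": "ok"}, {"status": "error"}],
    [{"status": "ok"}],
    [{"status": "error"}, {"status": "timeout"}],
]

count_by_status = make_local_aggregator(lambda e: e["status"])

# Lazy: no node is aggregated until cluster_aggregate consumes the generator.
partials = (count_by_status(events) for events in nodes)
totals = cluster_aggregate(partials)
```

In a real cluster, only the small per-node `Counter` objects would cross the network, not the raw events, which is the point of aggregating locally first.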

Similar resources

Application of Big Data Analytics in Power Distribution Network

Smart grid enhances optimization in generation, distribution and consumption of the electricity by integrating information and communication technologies into the grid. Today, utilities are moving towards smart grid applications, most common one being deployment of smart meters in advanced metering infrastructure, and the first technical challenge they face is the huge volume of data generated ...

Interactive Visual Big Data Analytics for Large Area Farm Biosecurity Monitoring: i-EKbase System

In this industrial application paper a novel application of salad leaf disease detection has been developed using a combination of big data analytics and on field multi-dimensional sensing. We propose a cloud computing based intelligent big data analysis and interactive visual analytics platform to predict farm hot spots with high probability of potential biosecurity threats and early monitorin...

Interactive Debugging for Big Data Analytics

An abundance of data in many disciplines has accelerated the adoption of distributed technologies such as Hadoop and Spark, which provide simple programming semantics and an active ecosystem. However, the current cloud computing model lacks the kinds of expressive and interactive debugging features found in traditional desktop computing. We seek to address these challenges with the development ...

Big Data Collections And Services For Building Intelligent Transport Applications

This paper presents an approach for building data collections and cloud services required for building intelligent transport applications. Services implement Big Data analytics functions that can bring new insights and useful correlations of large data collections and provide knowledge for managing transport issues. Applying data analytics to transport systems brings better understanding to the...

An Investigation of Cloud Computing in Big Data Analytics

This paper describes how cloud and big data technologies are converging to offer a cost-effective delivery model for cloud-based big data analytics. Cloud computing is a powerful technology for performing massive-scale and complex computing. It eliminates the need to maintain expensive computing hardware, dedicated space, and software. Large ...

Publication date: 2015